Detecting Negated and Uncertain Information in Biomedical and Review Texts

Author

  • Noa P. Cruz Díaz
Abstract

The thesis proposed here aims to support Natural Language Processing tasks through negation and speculation detection. We focus on the biomedical and review domains, in which it has been shown that handling these language forms improves the performance of the main task. In the biomedical domain, the existence of a corpus annotated for negation, speculation and their scope has made it possible to develop a machine learning system that automatically detects these language forms. Although performance on clinical documents is high, further work is needed to improve the system's effectiveness on scientific papers. In the review domain, on the other hand, the absence of a corpus annotated with this kind of information has led us to annotate a set of reviews for negation, speculation and their scope. The next step in this direction will be to adapt the system developed for the biomedical domain to the review domain.

Similar resources

Negation and Speculation Target Identification

Negation and speculation are common in natural language text. Many applications, such as biomedical text mining and clinical information extraction, seek to distinguish positive/factual objects from negative/speculative ones (i.e., to determine what is negated or speculated) in biomedical texts. This paper proposes a novel task, called negation and speculation target identification, to identify...

Hedge Classification in Biomedical Texts with a Weakly Supervised Selection of Keywords

Since facts or statements in a hedge or negated context typically appear as false positives, the proper handling of these language phenomena is of great importance in biomedical text mining. In this paper we demonstrate the importance of hedge classification experimentally in two real life scenarios, namely the ICD9-CM coding of radiology reports and gene name Entity Extraction from scientific ...

Uncertainty Detection in Hungarian Texts

Uncertainty detection is essential for many NLP applications. For instance, in information retrieval, it is of primary importance to distinguish among factual, negated and uncertain information. Current research on uncertainty detection has mostly focused on the English language; in contrast, here we present the first machine learning algorithm that aims at identifying linguistic markers of unc...

Information Extraction of Texts in the Biomedical Domain

Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing techniques can be applied in order to identify them. The main goal of this research is to develop a method to identify whether medical reports of imaging studies (usually called radiology reports) written in Spanish are important (in the sense that t...

Speculation and negation annotation in natural language texts: what the case of BioScope might (not) reveal

In information extraction, it is of key importance to distinguish between facts and uncertain or negated information. In other words, IE applications have to treat sentences or clauses containing uncertain or negated information differently from factual information, which is why the development of hedge and negation detection systems has received much interest – e.g. the objective of the CoNLL2010...

Publication date: 2013